StairNet: Top-Down Semantic Aggregation for Accurate One Shot Detection
One-stage object detectors such as SSD and YOLO have already shown promising accuracy with a small memory footprint and fast speed. However, it is widely recognized that one-stage detectors struggle to detect small objects even though they are competitive with two-stage methods on large objects. In this paper, we investigate how to alleviate this problem starting from the SSD framework. Due to its pyramidal design, the lower layers responsible for small objects lack strong semantics (e.g., contextual information). We address this problem by introducing a feature combining module that spreads strong semantics in a top-down manner. Our final model, the StairNet detector, effectively unifies multi-scale representations with this semantic distribution. Experiments on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets demonstrate that StairNet significantly alleviates the weaknesses of SSD and outperforms other state-of-the-art one-stage detectors.
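To make the top-down idea concrete, below is a minimal PyTorch sketch of a feature combining module in the spirit described above; the layer names, channel sizes, and the choice of a deconvolution plus element-wise sum are illustrative assumptions, not the paper's exact design.

    # Minimal sketch: fuse a semantically strong deep feature map into a
    # shallower, spatially fine one (assumed design, not StairNet's exact module).
    import torch
    import torch.nn as nn

    class TopDownCombine(nn.Module):
        def __init__(self, low_ch, high_ch, out_ch=256):
            super().__init__()
            self.lateral = nn.Conv2d(low_ch, out_ch, kernel_size=1)            # align channels of shallow map
            self.upsample = nn.ConvTranspose2d(high_ch, out_ch, 2, stride=2)   # 2x upsample the deep map
            self.smooth = nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1)  # smooth the fused map

        def forward(self, low_feat, high_feat):
            # Element-wise sum combines fine spatial detail with strong semantics.
            fused = self.lateral(low_feat) + self.upsample(high_feat)
            return self.smooth(torch.relu(fused))

    # Example: fuse an SSD-style 38x38 shallow map with a deeper 19x19 map.
    low = torch.randn(1, 512, 38, 38)
    high = torch.randn(1, 1024, 19, 19)
    print(TopDownCombine(512, 1024)(low, high).shape)  # torch.Size([1, 256, 38, 38])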
T2FPV: Constructing High-Fidelity First-Person View Datasets From Real-World Pedestrian Trajectories
Predicting pedestrian motion is essential for developing socially aware robots that interact in crowded environments. While the natural visual perspective for a social interaction setting is an egocentric view, the majority of existing trajectory prediction work has been conducted purely in top-down trajectory space. To support first-person view trajectory prediction research, we present T2FPV, a method for constructing high-fidelity first-person view datasets from a real-world, top-down trajectory dataset; we showcase our approach on the ETH/UCY pedestrian dataset to generate the egocentric visual data of all interacting pedestrians. We report that the bird's-eye view assumption used in the original ETH/UCY dataset, i.e., that an agent can observe everyone in the scene with perfect information, does not hold in first-person views; only a fraction of agents are fully visible during each 20-timestep scene commonly used in existing work. We evaluate existing trajectory prediction approaches under varying levels of realistic perception: displacement errors suffer a 356% increase compared to the top-down, perfect-information setting. To promote research in first-person view trajectory prediction, we release our T2FPV-ETH dataset and software tools.
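As a rough illustration of the kind of egocentric visibility filtering implied by the finding above, here is a toy NumPy sketch (not T2FPV's actual rendering pipeline) that keeps only the neighbours falling inside an assumed horizontal field of view of the ego pedestrian; the 90-degree FOV and the function name are hypothetical.

    # Toy sketch: which agents could an ego pedestrian plausibly see, given
    # only top-down positions and an assumed field of view?
    import numpy as np

    def visible_neighbours(ego_pos, ego_heading, others, fov_deg=90.0):
        """ego_pos: (2,) xy; ego_heading: radians; others: (N, 2) xy positions."""
        rel = others - ego_pos                                   # vectors from ego to the others
        angles = np.arctan2(rel[:, 1], rel[:, 0]) - ego_heading  # bearing relative to heading
        angles = (angles + np.pi) % (2 * np.pi) - np.pi          # wrap to [-pi, pi]
        return np.where(np.abs(angles) <= np.deg2rad(fov_deg) / 2)[0]

    # Example: an ego facing +x "sees" the agent ahead of it, not the one behind.
    print(visible_neighbours(np.array([0.0, 0.0]), 0.0,
                             np.array([[2.0, 0.5], [-3.0, 0.0]])))  # [0]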
SafeShift: Safety-Informed Distribution Shifts for Robust Trajectory Prediction in Autonomous Driving
As autonomous driving technology matures, the safety and robustness of its key components, including trajectory prediction, are vital. Though real-world datasets, such as Waymo Open Motion, provide realistic recorded scenarios for model development, they often lack truly safety-critical situations. Rather than relying on unrealistic simulation or dangerous real-world testing, we propose a framework to characterize such datasets and uncover the hidden safety-relevant scenarios within them. Our approach expands the spectrum of safety relevance, allowing us to study trajectory prediction models under a safety-informed distribution shift setting. We contribute a generalized
scenario characterization method, a novel scoring scheme to find subtly-avoided
risky scenarios, and an evaluation of trajectory prediction models in this
setting. We further contribute a remediation strategy, achieving a 10% average
reduction in prediction collision rates. To facilitate future research, we
release our code to the public: github.com/cmubig/SafeShif
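As a rough illustration of scenario scoring, the sketch below flags a recorded scene as safety-relevant when any two agents pass within an assumed near-miss distance; SafeShift's actual characterization and scoring scheme are more elaborate, so the metric, threshold, and function names here are illustrative assumptions only.

    # Toy proxy for "subtly-avoided risk": the minimum pairwise closest-approach
    # distance between agents in a recorded scene (assumed metric, not SafeShift's).
    import numpy as np

    def min_pairwise_gap(trajs):
        """trajs: (A, T, 2) array of A agent trajectories over T timesteps."""
        A = trajs.shape[0]
        gaps = [np.linalg.norm(trajs[i] - trajs[j], axis=-1).min()
                for i in range(A) for j in range(i + 1, A)]
        return min(gaps) if gaps else np.inf

    def is_safety_relevant(trajs, near_miss_m=1.0):
        # Flag scenes where two agents came within an (assumed) near-miss distance.
        return min_pairwise_gap(trajs) < near_miss_m

    # Example: two agents that pass within ~0.5 m of each other get flagged.
    t = np.linspace(0, 1, 20)
    a = np.stack([t * 10, np.zeros_like(t)], axis=-1)            # walking along +x
    b = np.stack([10 - t * 10, np.full_like(t, 0.5)], axis=-1)   # walking the other way, 0.5 m offset
    print(is_safety_relevant(np.stack([a, b])))  # True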